REPORT  DOCUMENTATION  PAGE 


Form  Approved 

OMB  No.  0704-0188 


Public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  data  sources, 

gathering  and  maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection 

of  information,  including  suggestions  for  reducing  this  burden  to  Washington  Headquarters  Service,  Directorate  for  Information  Operations  and  Reports, 

1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington,  VA  22202-4302,  and  to  the  Office  of  Management  and  Budget, 

Paperwork  Reduction  Project  (0704-0188)  Washington,  DC  20503. 

PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ADDRESS. 


1. REPORT DATE (DD-MM-YYYY)
OCTOBER 2009


2. REPORT TYPE
Conference Paper Preprint


3. DATES COVERED (From - To)
May 2008 - July 2009


4. TITLE AND SUBTITLE
CAMUS: AUTOMATICALLY MAPPING CYBER ASSETS TO MISSIONS
AND USERS (PREPRINT)


5a.  CONTRACT  NUMBER 

FA8750-08-C-0166


5b.  GRANT  NUMBER 


5c.  PROGRAM  ELEMENT  NUMBER 

65502D 


6.  AUTHOR(S) 


5d.  PROJECT  NUMBER 


Jason  K.  Kopylec  and  John  R.  Goodall 


5e.  TASK  NUMBER 


5f.  WORK  UNIT  NUMBER 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Applied  Visions,  Inc. 

6 Bayview Ave.

Northport, NY 11768-1502


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

AFRL/RIEA 
525  Brooks  Road 
Rome  NY  13441-4505 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


10.  SPONSOR/MONITOR'S  ACRONYM(S) 

N/A 


11.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 

AFRL-RI-RS-TP-2009-23 


12.  DISTRIBUTION  AVAILABILITY  STATEMENT 

Approved for public release; distribution unlimited. PA# 88ABW-2009-1876 Date Cleared: 2-June-2009


13.  SUPPLEMENTARY  NOTES 

This  work,  resulting  in  whole  or  in  part  from  Department  of  the  Air  Force  contract  number  FA8750-08-C-0166,  has  been  submitted 
to MILCOM 2009, SIMA Workshop, to be held in Boston, MA 19-21 Oct 2009. If this work is published, MILCOM may assert
copyright.  The  United  States  has  for  itself  and  others  acting  on  its  behalf  an  unlimited,  paid-up,  nonexclusive,  irrevocable  worldwide 
license  to  use,  modify,  reproduce,  release,  perform,  display,  or  disclose  the  work  by  or  on  behalf  of  the  Government.  All  other  rights 
are  reserved  by  the  copyright  owner. 


14.  ABSTRACT 

This  research  advances  Cyber  Situation  Management  by  proposing  methods  for  automated  mapping  of  Cyber  Assets  to  Missions  and 
Users  (CAMUS).  To  enable  accurate  and  efficient  cyber  incident  mission  impact  assessment,  a  CAMUS  ontology  that  defines 
entities,  relationships  and  attributes  (ERAs)  associated  with  them  has  been  drafted.  Methods  for  fusing  data  from  multiple  data 
sources  have  been  developed  alongside  an  ontology-based  system  to  populate  the  model  using  existing  network  data  sources.  The 
CAMUS  system  demonstrates  how  commonly  available  data  sources  can  be  rapidly  collected,  correlated,  and  fused  to  automatically 
map  cyber  assets  to  the  users  who  depend  on  them,  to  the  missions  they  support,  and  to  services  they  provide.  Also  discussed  are  the 
technical  architecture  and  challenges  to  such  an  approach. 


15.  SUBJECT  TERMS 

Cyber  Situation  Awareness,  Cyber  Impact  Assessment,  Network  Defense,  Global  Information  Grid  (GIG),  Level  3  Fusion 


16. SECURITY CLASSIFICATION OF:
a. REPORT: U
b. ABSTRACT: U
c. THIS PAGE: U

17. LIMITATION OF ABSTRACT
UU

18. NUMBER OF PAGES
8

19a. NAME OF RESPONSIBLE PERSON
George P. Tadda

19b. TELEPHONE NUMBER (Include area code)
N/A

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std.  Z39.18 


UNCLASSIFIED 


ID#  900355 


CAMUS:  AUTOMATICALLY  MAPPING  CYBER  ASSETS  TO  MISSIONS  AND  USERS 

Jason  K.  Kopylec  and  John  R.  Goodall 
Applied  Visions,  Inc. 

Secure  Decisions  Division 
Northport,  NY 


ABSTRACT 

This  research  advances  Cyber  Situation  Management  by 
proposing  methods  for  automated  mapping  of  Cyber 
Assets  to  Missions  and  Users  (Camus).  To  enable  accurate 
and  efficient  cyber  incident  mission  impact  assessment,  a 
Camus  ontology  that  defines  entities,  relationships  and 
attributes  (ERAs)  associated  with  them  has  been  drafted. 
Methods  for  fusing  data  from  multiple  data  sources  have 
been  developed  alongside  an  ontology-based  system  to 
populate  the  model  using  existing  network  data  sources. 
The  Camus  system  demonstrates  how  commonly  available 
data  sources  can  be  rapidly  collected,  correlated,  and 
fused  to  automatically  map  cyber  assets  to  the  users  who 
depend  on  them,  to  the  missions  they  support,  and  to  the 
services  they  provide.  Also  discussed  are  the  technical 
architecture  and  challenges  to  such  an  approach. 

INTRODUCTION 

To  effectively  remediate  a  cyber  asset  compromise, 
analysts  need  to  clearly  understand  the  relationships 
between  the  compromised  asset  and  the  affected  missions 
and  users.  If  all  that  is  known  about  a  compromised  host  is 
its  IP  Address,  there  is  no  evidence  to  project  the 
cascading  effects.  Today’s  network  analysts  have  a  limited 
view  of  the  roles  cyber  assets  play  in  the  overall  enterprise. 
Without  this  information,  an  analyst  cannot  accurately 
prioritize  and  assign  resources  to  perform  remediation. 

The  objective  of  this  research  is  to  improve  cyber  situation 
management  by  developing  an  automated  mapping  of 
Cyber  Assets  to  Missions  and  Users  (Camus)  that 
facilitates  accurate  and  efficient  Cyber  Incident  Mission 
Impact  Assessment  (CIMIA)  [9].  Central  to  the  effort  is 
the  development  of  the  Camus  system,  capable  of 
automated  relationship  discovery  between  cyber  assets, 
missions  and  users.  To  derive  needed  contextual 
information  in  an  automated  way,  semantic  web  concepts 
are  applied  to  model  and  automatically  fuse  the  needed 


information. The Camus system integrates a number of
common network feeds, demonstrating how existing data
sources can be used in new ways to provide contextual
mission information.

RELATED  WORK 

Much  of  the  grounding  for  Camus  comes  from  Salerno’s 
[16]  [15]  Air  Force  Situational  Awareness  Model 
(AFSAM).  This  model  describes  the  path  that  data  takes  to 
become  information  that  can  be  consumed  by  analysts  for 
improved  situation  management.  Of  most  interest  to 
Camus  is  the  portion  of  the  AFSAM  labeled  as 
“knowledge of us,” which provides contextual information
about  the  operational  environment  that  critical 
infrastructure  supports.  Tadda  et  al.  [17]  refined  the 
general  AFSAM  and  applied  it  directly  to  the  cyber 
domain,  resulting  in  the  Cyber  SA  Model.  Within  the 
Cyber  SA  Model,  the  “knowledge  of  us”  required  for 
situation  management  is  an  accurate  understanding  of  how 
operations  are  impacted  when  there  are  degradations  and 
compromises  in  the  cyber  infrastructure. 

The  aim  of  Camus  is  to  provide  this  information  in  a 
continually  up-to-date,  automated  and  scalable  way  that  is 
usable by both human analysts and other Cyber SA
systems.  The  work  of  Holsopple  et  al.  [11],  also  grounded 
in  the  Cyber  SA  model,  develops  a  Virtual  Terrain  that 
models  the  network  and,  to  a  limited  extent,  takes  mission 
context  into  account.  Problematically,  their  mission-related 
information  must  be  manually  added  by  analysts.  It  is  this 
manual  data  entry  that  Camus  attempts  to  automate. 
Grimaila and Fortson [9] shift the focus of situation
management away from cyber assets and toward
information assets. They argue that what is truly valuable
is  the  information  that  resides  on  the  hardware  and  its 
confidentiality,  integrity  and  availability.  They  propose  a 
Cyber  Damage  Assessment  Framework  that  requires  the 
manual  definition  and  prioritization  of  both  operational 
processes and information assets. Bryant and Grimaila [3]
show  that  there  are  a  number  of  pitfalls  when  collecting 
information-centric  data  and  that  much  of  it  is  unavailable 
electronically.  The  Camus  approach  can  greatly  aid  the 
collection  and  automation  of  information  assets,  although 
this  is  currently  left  to  future  work. 

Work  by  Gomez  et  al.  [8]  in  the  domain  of  sensor-mission 
assignment  applied  a  similar  approach  to  Camus  in  the 
area  of  automated  assignment  of  intelligence,  surveillance 
and  reconnaissance  (ISR)  assets  to  specific  military 
missions.  Their  Missions  and  Means  Framework  (MMF) 
ontology  closely  parallels  the  Camus  ontology  including 
concepts  such  as  missions,  operations,  tasks,  capabilities 
and  systems.  Lewis  et  al.  [13]  propose  their  own  mission 
reference  model  and  are  tackling  the  mapping  of  cyber 
assets  to  missions  from  a  mathematical  constraint 
satisfaction approach. What Lewis et al. do not address
is the practical matter of collecting and fusing the data
needed to support their mathematical models.

MOTIVATION 

Barger  [1]  describes  a  need  for  improved  cyber  situation 
management  that  is  based  on  a  shared  understanding 
between  mission  commanders  and  network  analysts  about 
how  compromises  to  cyber  assets  will  affect  mission 
essential  functions  (MEFs).  Network  security  management 
systems  do  little  to  facilitate  this  common  operating 
picture.  Today’s  cyber  defenders  often  have  little 
contextual  information  about  a  compromised  asset  beyond 
its  IP  address  and  an  Intrusion  Detection  System  (IDS) 
alert description. When only an IP address is known, the
affected machine could be the desktop belonging to a
janitor who maintains an inventory of cleaning supplies.
On the other hand, it could be a file server that supports
time-critical communication between commanders in
theater. In order to properly respond to cyber
compromises, analysts need to know who uses the
compromised asset and what it is used for. Only then can
its criticality be determined and appropriate
countermeasures be taken. In the case of the janitor’s
desktop, perhaps no action is needed, whereas for the
critical file server, the information that was lost might have
critical effects on the success of supported missions.

The  primary  operational  obstacle  is  a  lack  of  existing  data 
sources  that  accurately  map  cyber  assets  to  the  missions 
they  support.  Even  if  cyber  assets’  functions  are 


documented  when  initially  put  on  the  network,  that 
information  quickly  becomes  obsolete  as  the  network  is 
reconfigured over time. In current operations, mapping
cyber assets to missions and users is a manual,
time-consuming, error-prone, and expensive process, so it is
rarely attempted.

Even if manual methods are employed, the actual use
of the network in operation often differs considerably from
its original architecture. In addition, the
networks’ interdependencies are so numerous and complex
that comprehending the mappings is impossible without
proper formatting and display.

An  optimal  solution  to  these  problems  should  provide  the 
needed  information  to  enable  effective  CIMIA.  The  cyber 
asset  to  mission  mappings  should  be  trusted  and  accurate, 
maintaining  provenance  to  trace  back  to  original  sources. 
Moreover,  the  information  should  be  targeted  to  the 
particular  role  of  the  user.  For  example,  the  information 
needed  by  a  commander  to  evaluate  the  go/no-go  status  of 
his  missions  is  very  different  from  the  picture  needed  by  a 
network  analyst  to  determine  how  to  improve  redundancy 
and  resiliency  of  the  enterprise  network.  The  commander 
needs  a  deep  understanding  of  the  missions  he  oversees 
and  is  less  interested  in  the  bits  and  bytes  of  the  underlying 
computer  network.  Conversely,  the  network  analyst  needs 
a  detailed  knowledge  of  how  the  network  is  configured 
and  running;  mission  and  task-related  information  is  only 
used  to  determine  asset  criticality  and  to  ensure  that  the 
supporting  infrastructure  is  in  place  and  working  properly. 
An  optimal  automated  solution  should  be  flexible  to  this 
variation  in  role-based  granularity. 

An  even  tougher  challenge  is  to  assign  dependencies  and 
criticality  metrics  to  the  relationships.  It  is  one  thing  to  say 
that  a  particular  file  server  is  used  during  a  mission  by  a 
particular  person.  A  much  deeper  knowledge  about 
mission  requirements  is  needed  to  determine  automatically 
that  the  file  server  is  critical  and  depended  on  for 
successful  execution. 

TECHNICAL  APPROACH 

The  primary  goal  of  the  Camus  technical  solution  is  to 
meet  these  challenges  and  provide  the  needed  context  to 
support  automated  CIMIA.  Armed  with  such  a  technology, 
the  critical  role  that  cyber  assets  have  in  mission  success 
can be better understood. Beyond these research ideals,
there is also the practical requirement that Camus be
relevant and operationally feasible in today’s large and
dynamic  networks.  The  Camus  approach  is  grounded  in 
the  idea  that  the  needed  data  does  exist  in  digital  format, 
but  is  in  disparate  locations  and  formats.  Hence,  much  of 
the  exercise  to  derive  asset  and  mission  relationships 
becomes  a  data  mining,  inference,  and  fusion  task. 

The  Camus  system  relies  on  an  ontology-based  semantic 
approach  to  data  integration  and  fusion,  similar  to  the 
concepts  discussed  in  Yoakum-Stover  and  Malyuta  [19]. 
The  ontologies  were  designed  with  SMEs  in  terms  of 
entities,  relationships  and  attributes  (ERAs).  The  resulting 
ERAs  were  then  translated  into  semantic  ontologies,  using 
the  methodology  in  Fahad  [6]. 

To  build  the  Camus  technology  solution,  a  system 
architecture  has  been  developed  along  with  a  software 
platform  based  on  concepts  from  the  semantic  web.  The 
semantic  web  uses  ontologies  as  a  structured 
representation of ERAs of a domain. The Camus system
uses the common semantic web tools Protege and the
XML-based Web Ontology Language (OWL) to represent its
ontologies.  Figure  1  graphically  represents  the  high-level 
core  of  the  Camus  ontology. 


[Figure 1 diagram: nodes Mission, Cyber Capability, User and
Cyber Asset; legible edge labels: supports, provides.]


Figure  1.  High-level  Camus  ontology  mapping  cyber 
assets  to  missions  and  users. 


The  core  Camus  ontology  depicts  the  semantic 
relationships  between  missions,  cyber  capabilities,  users 
and  cyber  assets.  To  implement  the  Camus  system,  this 
core  ontology  is  extended  to  encompass  the  level  of  data 
granularity  needed  for  a  particular  operator  role.  For 
example,  a  network  analyst  would  require  much  more 
detail  about  cyber  assets,  including  the  applications,  data 
and  hosts  and  devices  interacting  with  one  another,  but 
only  needs  a  rough  approximation  of  missions  and  their 
criticality.  Conversely,  a  commander  may  need  only  basic 
information  about  assets,  but  require  much  more 


information  about  the  essential  and  specified  mission  tasks 
that  are  under  his  control. 

Building  from  the  core  ontology,  data  sources  were 
identified  that  together  can  populate  the  model  and  provide 
the  needed  information.  Instead  of  populating  the  ontology 
all  at  once,  the  Camus  technical  approach  involves 
translating  data  feeds  that  supply  basic  relationships  for 
small  portions  of  the  Camus  ontology,  piece-by-piece. 
Even  if  portions  of  the  ontology  cannot  be  populated  from 
available  data,  the  logical  reasoning  capabilities  implicit  in 
OWL  can  infer  indirect  relationships  and  fill  in  this 
missing  information. 

To illustrate the approach, take an analyst who needs to
know which users’ workstations are associated with which
departments on a large enterprise network and has access
to two data sources, FTP logs and an LDAP server. The FTP
logs and LDAP query results contain items that look like
the sample in Table 1.


Table 1. Sample data records for FTP logs and LDAP query
results, which include user and cyber asset information.


FTP Log                        LDAP query
... jsmith@100.10.20.4 ...     ... jsmith Logistics ...
... sjones@100.10.20.6 ...     ... llaurel Administrative ...
... llaurel@100.10.20.9 ...    ... sjones Finance ...

To model this information, a base ontology is created that
contains entities [User], [Department] and [IP Address],
where each IP address identifies a workstation.
Added to that are semantic relationships to connect the
entities: [User isMemberOf Department], [Workstation
isUsedBy User] and [Workstation supports Department].
The relationship that the analyst really wants is the last
one, [Workstation supports Department], but it is not
explicit in the data sources. To derive this information, the
data sources are used to instantiate the relationships they
do represent explicitly, such as [100.10.20.4 isUsedBy
jsmith] and [jsmith isMemberOf Logistics]. Note that each
of the data sets is only responsible for the relationships it
can provide. Using [User] as an alignment point, i.e. an
entity that is common between relationships, the semantics
of the ontology can now infer the relationship that the
analyst is really interested in, [100.10.20.4 supports
Logistics], as shown in Figure 2.
Figure 2. A sample instantiation of an ontology using two
separate data sources. The files have been directly
translated into semantic relationships (solid lines),
allowing for automated inference of the relationship of
interest [100.10.20.4 supports Logistics] (dashed line).
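
The alignment-point inference can be sketched in a few lines of
Python. This is a minimal illustration, not the Camus implementation:
the paper performs the inference with OWL reasoning, whereas the sketch
hard-codes one rule by hand. The names Triple and infer_supports are
hypothetical; the data values are those of Table 1.

```python
from collections import namedtuple

# Hypothetical triple type: (subject, predicate, object).
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Relationships each data source states explicitly (values from Table 1).
ftp_triples = [
    Triple("100.10.20.4", "isUsedBy", "jsmith"),
    Triple("100.10.20.6", "isUsedBy", "sjones"),
    Triple("100.10.20.9", "isUsedBy", "llaurel"),
]
ldap_triples = [
    Triple("jsmith", "isMemberOf", "Logistics"),
    Triple("llaurel", "isMemberOf", "Administrative"),
    Triple("sjones", "isMemberOf", "Finance"),
]


def infer_supports(triples):
    """Apply one rule: W isUsedBy U and U isMemberOf D => W supports D."""
    # Index department membership by user, the alignment point.
    departments = {}
    for t in triples:
        if t.predicate == "isMemberOf":
            departments.setdefault(t.subject, set()).add(t.obj)
    # Join workstation usage against that index.
    inferred = set()
    for t in triples:
        if t.predicate == "isUsedBy":
            for dept in departments.get(t.obj, ()):
                inferred.add(Triple(t.subject, "supports", dept))
    return inferred


for triple in sorted(infer_supports(ftp_triples + ldap_triples)):
    print(triple.subject, triple.predicate, triple.obj)
```

Neither source mentions a workstation and a department together; the
join on the shared user name is what produces the new relationship.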

This  capability  is  powerful  because  many  data  sources  can 
be  used  to  populate  small  portions  of  a  base  ontology  and 
shared  concepts  become  alignment  points  and  fuse  the 
data.  The  translation  from  raw  data  sources  into  the 
ontology  model  can  be  done  in  one  of  three  ways: 

•  Direct Translation - Relationships can be drawn
directly from the data; for example
...jsmith@100.10.20.4... can be directly converted into
[jsmith uses 100.10.20.4].

•  Inferred  Translation  -  Relationships  can  be  assigned 
from  data  using  heuristics  or  statistical  methods;  for 
example,  to  assign  the  relationship  [User  dependsOn 
Workstation]  if  there  is  no  data  source  that  represents 
this  dependency  explicitly,  a  heuristic  rule  can  be 
applied  so  that  if  a  user  always  logs  on  to  the  same 
single  workstation,  then  the  inference  is  made  that  that 
user  dependsOn  that  workstation. 

•  Ontology-to-Ontology  Translation  -  If  the  data  source 
has  its  own  model  or  schema,  e.g.  Microsoft 
Operations  Framework  (MOF)  or  the  Universal  Joint 
Task  List  (UJTL),  entities  can  be  aligned  at  the 
ontology  level  by  defining  alignment  points;  instances 
of  one  ontology  are  then  automatically  treated  as 
corresponding  instances  in  the  second. 
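
The first two translation styles can be sketched as follows. This is
an illustrative Python sketch under stated assumptions, not the Camus
code: the regular expression assumes log fragments of the user@ip form
shown in Table 1, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern for the "user@ip" fragments shown in Table 1.
LOGIN_RE = re.compile(r"(?P<user>[A-Za-z0-9_]+)@(?P<ip>\d{1,3}(?:\.\d{1,3}){3})")


def direct_translation(log_lines):
    """Turn each user@ip occurrence into a (user, 'uses', ip) triple."""
    triples = []
    for line in log_lines:
        for match in LOGIN_RE.finditer(line):
            triples.append((match.group("user"), "uses", match.group("ip")))
    return triples


def inferred_translation(uses_triples):
    """Heuristic: a user seen on exactly one workstation dependsOn it."""
    hosts_by_user = defaultdict(set)
    for user, _, ip in uses_triples:
        hosts_by_user[user].add(ip)
    return [
        (user, "dependsOn", next(iter(ips)))
        for user, ips in sorted(hosts_by_user.items())
        if len(ips) == 1
    ]


log = [
    "... jsmith@100.10.20.4 ...",
    "... jsmith@100.10.20.4 ...",
    "... sjones@100.10.20.6 ...",
    "... sjones@100.10.20.9 ...",
]
uses = direct_translation(log)
print(inferred_translation(uses))
```

In this sample, jsmith only ever appears on one workstation and so
receives a dependsOn relationship, while sjones, seen on two, does not.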

These techniques define a process for ontology fusion,
bringing together disparate network data sources to define
and infer mappings between cyber assets, missions and
users. Once the needed information is modeled in an
ontological format, data fusion happens automatically (see
Boury-Brisset [2]). The results can then be coupled with
other SA systems to provide programmatic access to the
mission mappings and provide role-based information
visualization views that depict the needed information.


ARCHITECTURE 

Much  of  the  development  effort  has  been  to  build  a  system 
that  implements  the  semantic  functionality  and  situation 
awareness  that  the  Camus  technical  approach  describes. 
The  Camus  system  integrates  a  number  of  technologies 
that  are  used  in  the  biological  sciences  and  digital  content 
management  domains,  including  OWL,  Jena,  Lucene, 
Protege, Servlet-based APIs and web-based visualizations.
Base ontologies are modeled in Protege and exported as
XML-based OWL files. These base ontologies are then
coupled with easily understood JavaScript scripts that define
where to find and how to translate available data sources
into an instantiation of the base ontology. Within Camus,
the  base  ontologies  and  their  corresponding  scripts  are 
referred  to  as  ontology  fuselets.  Multiple  ontology  fuselets 
can  be  combined  into  a  master  fusion  ontology  that  defines 
the  alignment  points  among  the  base  ontologies. 
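
One way to picture an ontology fuselet is as a small record that pairs
the entity types of a base ontology with a translation function for one
data source. The sketch below is an assumption-laden illustration: the
paper's fuselets use OWL files and JavaScript, whereas this sketch uses
Python, and the Fuselet class, its fields and the parsers are all
hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Fuselet:
    """Hypothetical pairing of a base ontology with its translation script."""
    name: str
    entities: Tuple[str, ...]  # entity types this fuselet can populate
    translate: Callable[[Iterable[str]], List[Triple]]


def parse_ftp(lines):
    """Translate user@ip tokens into isUsedBy triples."""
    triples = []
    for line in lines:
        for token in line.split():
            if "@" in token:
                user, ip = token.split("@", 1)
                triples.append((ip, "isUsedBy", user))
    return triples


def parse_ldap(lines):
    """Translate 'user department' records into isMemberOf triples."""
    triples = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            triples.append((fields[0], "isMemberOf", fields[1]))
    return triples


ftp_fuselet = Fuselet("ftp", ("User", "Workstation"), parse_ftp)
ldap_fuselet = Fuselet("ldap", ("User", "Department"), parse_ldap)


def alignment_points(fuselets):
    """Return entity types that appear in at least two fuselets."""
    seen, shared = set(), set()
    for fuselet in fuselets:
        for entity in set(fuselet.entities):
            if entity in seen:
                shared.add(entity)
            seen.add(entity)
    return shared


print(alignment_points([ftp_fuselet, ldap_fuselet]))
```

Here the shared entity types are computed automatically for brevity; in
the architecture described above, the master fusion ontology declares
the alignment points explicitly.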

Figure  3  below  shows  the  conceptual  system  architecture, 
which  consists  of  three  main  components:  Data 
Integration,  Information  Fusion  and  Knowledge 
Management. 
Figure 3. Camus system architecture overview.


The  Data  Integration  module  takes  the  ontology  fuselets  as 
input  and  uses  them  to  parse  raw  data  sources  into 
ontology  instances.  The  data  sources  can  come  in  an  array 
of  formats  such  as  semi-structured  text  files,  databases  or 
processes  that  publish  alert  or  log  information  over  the 
network.  Once  data  sources  are  translated  into  an  ontology 
representation,  they  are  passed  to  the  Information  Fusion 
module  which  uses  the  alignment  points  in  the  master 
fusion  ontology.  It  semantically  couples  the  individual 
pieces  together,  performing  automated  data  fusion.  The 
resulting  fully  instantiated  and  fused  ontology  contains  a 
complete  mapping  of  relationships  between  cyber  assets, 
missions  and  users.  To  handle  issues  of  ontology 
scalability,  the  Camus  system  implements  ontology 
caching  mechanisms  based  on  the  concepts  of  Karnstedt  et 
al.  [12].  The  complete  ontology  can  be  extremely  large,  so 
it  is  persisted  using  high-performance  external  Resource 
Description  Framework  (RDF)  stores  and  indexes.  This 




cache  provides  fast  and  robust  querying  mechanisms, 
which  are  accessed  by  the  third  module,  Knowledge 
Management. 

The Knowledge Management module consists of two
parts: web APIs to access the ontology programmatically,
and visualization capabilities for displaying mission
context information directly to operators. The
programmatic APIs are web-based for easy coupling to
outside systems, so that the mission context information
contained  in  the  ontology  cache  can  be  made  available  to 
external  Cyber  SA  systems.  The  user  interface  provides  a 
point-and-click  visual  interface  to  Camus  information. 
When  an  item  of  interest  is  clicked,  such  as  an  IP  Address, 
a  client-side  query  is  created.  The  client-side  query  is 
parsed  by  the  Knowledge  Management  component  and 
passed  to  the  Information  Fusion  engine.  The  cache  is 
consulted  and  updated,  if  needed,  by  the  Data  Integration 
module,  which  uses  the  ontology  fuselets  to  retrieve  the 
needed  information  from  the  original  data  sources. 
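
The consult-then-update behavior of the cache can be sketched as a
read-through cache. This is a simplified illustration only: the paper
persists the ontology in external RDF stores, while the sketch keeps a
dictionary in memory, and OntologyCache and load_from_sources are
hypothetical names.

```python
class OntologyCache:
    """In-memory stand-in for the ontology cache, filled on demand."""

    def __init__(self, loader):
        self._loader = loader  # callable: key -> list of triples
        self._store = {}
        self.misses = 0

    def query(self, key):
        # On a miss, fall back to the loader, which plays the role of the
        # Data Integration module re-reading the original data sources.
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._loader(key)
        return self._store[key]


def load_from_sources(ip):
    """Hypothetical loader backed by a tiny hard-coded data set."""
    raw = {"100.10.20.4": [("100.10.20.4", "isUsedBy", "jsmith")]}
    return raw.get(ip, [])


cache = OntologyCache(load_from_sources)
first = cache.query("100.10.20.4")   # parsed from the sources
second = cache.query("100.10.20.4")  # served from the cache
print(first == second, cache.misses)
```

Only the first query for an IP address touches the raw data; repeated
clicks on the same address are answered from the cache.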

The  mission  context  results  are  finally  packaged  up  as 
OWL  or  GraphML  and  returned  to  the  client  via  HTTP  for 
further  manipulation  and/or  display  to  the  operator.  The 
returned  ontologies  can  also  include  visual  attributes  to  aid 
visualization,  similar  to  Rahman  et  al.  [14].  Carroll  [4] 
presents  a  number  of  applicable  role-focused  views  for 
displaying  visual  mission  hierarchies  and  the  supported 
infrastructure.  A  number  of  performance  enhancements 
have  been  made  to  the  fusion  engine  and  cache  to  ensure 
that  user  queries  are  returned  within  reasonable  time  to 
keep  the  user  interface  running  smoothly  and  to  parse  large 
data  sources  rapidly. 
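
As an illustration of the GraphML packaging step, the sketch below
serializes a handful of relationships with the Python standard library.
It is not the Camus serializer: the function name, the key id "rel" and
the graph id "context" are assumptions chosen for the example, and the
HTTP transport is omitted.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def triples_to_graphml(triples):
    """Serialize (subject, predicate, object) triples as a GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare an edge attribute that carries the relationship name.
    ET.SubElement(root, "key", {
        "id": "rel",
        "for": "edge",
        "attr.name": "relationship",
        "attr.type": "string",
    })
    graph = ET.SubElement(root, "graph", id="context", edgedefault="directed")
    # Emit each distinct subject or object once as a node.
    nodes = sorted({t[0] for t in triples} | {t[2] for t in triples})
    for node_id in nodes:
        ET.SubElement(graph, "node", id=node_id)
    # Emit one directed edge per triple, labeled with its predicate.
    for index, (subject, predicate, obj) in enumerate(triples):
        edge = ET.SubElement(graph, "edge", id=f"e{index}",
                             source=subject, target=obj)
        data = ET.SubElement(edge, "data", key="rel")
        data.text = predicate
    return ET.tostring(root, encoding="unicode")


print(triples_to_graphml([("100.10.20.4", "isUsedBy", "jsmith")]))
```

Visual attributes such as color or shape could be attached in the same
way, by declaring further key elements and adding data children to
nodes or edges.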

APPLICATION 

To  illustrate  the  approach,  a  Camus  prototype  was 
developed  that  displays  mission  context  information 
coupled  to  an  existing  intrusion  detection  system.  If  an 
organization  has  access  to  detailed  electronic  mission 
planning  specifications,  they  can  be  easily  integrated  into 
the  Camus  system.  In  practice  though,  the  network  defense 
community  does  not  have  access  to  mission  specifications 
and  tasks,  due  to  accessibility  restrictions  and 
classification.  So  the  Camus  system  uses  the  enterprise 
organizational  chart  as  an  adequate  baseline  for 
representing  users’  roles  and  organizational  missions.  The 
organizational chart is an easy-to-access, usually
unclassified and regularly updated document that maps


people  to  their  roles,  departments  and  superiors.  What  the 
organization  chart  does  not  show,  nor  does  any  other 
common  network  data  source,  are  direct  mappings  of 
cyber  assets  to  specific  missions.  To  determine  this 
information,  we  add  network  data,  like  user  logs  and 
network  traffic,  which  can  be  used  to  infer  cyber  asset-to- 
mission  relationships.  If  the  system  can  deduce  how  users 
support  portions  of  the  organization  and  also  which 
machines  they  regularly  access,  it  can  show  a  reasonable 
approximation  of  how  compromised  cyber  assets  may 
affect  the  organization.  Here  [Users]  are  an  alignment 
point  for  inferring  cyber  asset-to-mission  dependencies. 

To demonstrate the Camus system capabilities, an existing
network security data set of network traffic and system
logs was mined for asset-, mission- and user-related data.
LDAP provides a list of users, their roles and departments.
FTP and Unix logs were processed to determine the logical
network topology and user social network. These
host-to-host communication networks provide information
such as which machines regularly use a particular mail server.
With these basic data sources in hand - LDAP, NetFlow
traffic and user logs - fuselets were created for each, as
well as a fusion ontology to align the common features,
such as username and IP Address.
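
A question such as "which machines regularly use a particular mail
server" reduces to counting flows per source host. The following sketch
assumes flow records have already been reduced to (source, destination)
pairs; the function name, the sample addresses and the threshold of
three flows are illustrative assumptions, not values from the
demonstration data set.

```python
from collections import Counter


def regular_clients(flows, server_ip, min_flows=3):
    """Return hosts that contacted server_ip at least min_flows times."""
    counts = Counter(src for src, dst in flows if dst == server_ip)
    return sorted(src for src, n in counts.items() if n >= min_flows)


# Hypothetical flow records: (source IP, destination IP).
MAIL_SERVER = "100.10.20.25"
flows = [
    ("100.10.20.4", MAIL_SERVER),
    ("100.10.20.4", MAIL_SERVER),
    ("100.10.20.4", MAIL_SERVER),
    ("100.10.20.6", MAIL_SERVER),
    ("100.10.20.4", "100.10.20.80"),
]
print(regular_clients(flows, MAIL_SERVER))
```

The resulting host lists can then be expressed as relationships between
cyber assets and fused with the user and department data through the
shared IP Address.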

The  ontology  fuselets  automatically  parse  the  source  data 
to  populate  and  store  the  mission  ontology.  With  the 
populated  model  mapping  cyber  assets  to  missions  and 
users,  the  next  step  was  to  demonstrate  how  that 
information  can  be  used  to  provide  improved  mission 
context  to  analyze  and  remediate  cyber  asset  compromises. 
A web application was built that displays IDS alerts
(provided by Snort IDS) and links their IP Addresses to Camus
visualization views. When the user clicks on an IP Address
of interest, the Camus fusion engine consults the cache and
parses any needed raw data files on the fly. The results are
formatted in HTML and returned to the client as
easily understandable graphics. These are displayed to the
user through the browser, providing on-demand mission
context information, as shown in Figure 4.
Figure 4. Camus user interface displaying mission context
information  for  an  attacked  cyber  asset. 

In  preparing  the  system,  there  were  a  number  of  run-time 
performance  challenges.  These  were  focused  in  two  areas: 
1)  scaling  the  amount  of  data  that  can  be  parsed  and 
cached  and  2)  ensuring  that  the  user  experience  was  fast 
enough  for  normal  browsing.  The  corpus  of  NetFlow 
traffic  used  in  the  demonstration  has  over  ten  thousand 
unique  IP  Addresses  and  is  over  one  gigabyte  in  size.  A 
number  of  high  performance  indexing  and  custom  filtering 
mechanisms  were  implemented  to  increase  throughput. 
The  Camus  fusion  engine  was  able  to  parse  and  fuse  all  of 
this  data  efficiently  using  a  commodity  laptop  and  small 
memory footprint. The web interface responds within normal
web browsing response times, ranging from near
instantaneous for simple queries to less than ten seconds
for very complex queries.

From  the  Camus  user  interface,  it  is  easy  to  see  how 
attacked  assets  support  specific  users,  portions  of  the 
organization  and  other  cyber  assets.  For  example,  it  was 
immediately  apparent  which  departments  authenticate  to  a 
particular  domain  server  and  which  web  servers  were  most 
widely  used  by  external  hosts.  The  system  successfully 
meets  the  requirements  of  providing  flexible  and  rapid 
integration  of  a  number  of  disparate  data  sources  to 
automatically  map  cyber  assets  to  the  missions  and  users 
they  support. 

FUTURE  WORK 

The  next  steps  are  to  augment  the  existing  Camus  system 
with  more  sophisticated  network  management  data 
sources,  such  as  configuration  management  databases 
(CMDB)  and  cyber  security  monitoring  systems.  In 
addition,  our  research  team  is  working  on  an  expanded 


mission  ontology  that  uses  military  planning  standards, 
such  as  UJTL,  to  better  model  tasks  and  missions. 

The  Camus  system  will  also  store  additional  provenance 
information  for  the  inferred  relationships.  This  provenance 
enables  dynamic  drill-in  to  see  original  data  sources  that 
were  used  to  create  an  asset-mission  mapping.  This 
provenance  will  be  provided  so  an  analyst  using  Camus 
can  corroborate  findings  and  improve  trust  and  reliability 
in  the  system. 

In  addition  to  provenance,  capabilities  will  be  added  to 
assign  metrics  to  the  relationships.  This  enables  dynamic 
computation  of  metrics  such  as  mission  criticality  and 
redundancy.  Chew  et  al.  [5]  discuss  how  network  security 
metrics  should  be  aligned  closely  with  the  missions  of  an 
agency. Once these metrics are captured for portions of the
mission ontology, they will be propagated throughout the
rest of the model using conditional probability methods
to improve inference and automated mapping results.
These metrics will also include calculations of risk, similar
to the work of Watters et al. [18], which proposes methods
for calculating risk based on cyber asset mission
dependencies.
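
The paper does not specify which conditional probability method is
intended. Purely as one possible reading, the sketch below combines
independent dependency paths with a noisy-OR rule, so that a department
is affected if at least one of its dependent users is affected. The
function name, the independence assumption and all probability values
are illustrative assumptions.

```python
def propagate(asset, asset_user, user_dept):
    """Estimate P(department impacted | asset compromised) via noisy-OR.

    asset_user maps (asset, user) to P(user impacted | asset compromised).
    user_dept maps (user, dept) to P(dept impacted | user impacted).
    Paths are assumed independent, which is a simplification.
    """
    impact = {}
    for (a, user), p_user in asset_user.items():
        if a != asset:
            continue
        for (u, dept), p_dept in user_dept.items():
            if u != user:
                continue
            # Probability that no path seen so far impacts the department.
            unaffected = 1.0 - impact.get(dept, 0.0)
            impact[dept] = 1.0 - unaffected * (1.0 - p_user * p_dept)
    return impact


# Hypothetical dependency strengths.
asset_user = {("fileserver", "jsmith"): 0.5, ("fileserver", "sjones"): 0.5}
user_dept = {("jsmith", "Logistics"): 1.0, ("sjones", "Logistics"): 1.0}
print(propagate("fileserver", asset_user, user_dept))
```

With two users who each depend on the asset with probability 0.5, the
combined estimate for their shared department is 1 - 0.5 * 0.5 = 0.75.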

Finally,  continued  improvements  will  be  made  to  the 
Camus  system  API  and  user  interface.  Adequate 
visualization  of  large  ontologies  is  an  open  and  active  area 
of  research  and  development  throughout  the  semantic  web 
community. 

CONCLUSION 

Camus  advances  the  state-of-the-art  in  situation 
management  by  providing  essential  ‘knowledge  of  us’  to 
the Cyber SA Model. Methods for automatically mapping
cyber assets to the missions and users that rely on them
were discussed. This capability is essential for CIMIA and
will play an increasingly crucial role as cyber operations
become more pervasive. The Camus
technical  approach  uses  ontology  fusion  and  emerging 
semantic  web  tools  to  parse  disparate  data  sources  into  a 
unified  model  of  domain  entities,  relationships  and 
attributes.  Three  methods  were  explained  for  converting 
available  data  sources  into  a  semantic  ontology 
representation,  namely  direct,  inferred  and  ontology-to- 
ontology  translation.  The  Camus  software  solution  builds 
on  these  methods  to  bridge  the  gaps  between  data, 
information  and  knowledge.  The  resulting  system  provides 
context  directly  to  analysts  or  to  other  cyber  situation 
awareness systems. In addition, it displays mission-critical
information to users, improving overall situation
awareness. The Camus system has been demonstrated with
readily available data sources and found to be both
operationally grounded and reasonably scalable. Overall,
Camus illustrates that practical, accurate, and automated
cyber mission impact information is within reach
throughout the cyber defense and network management
community.